Hierarchical image model adaptation

ABSTRACT

The invention relates to a method for processing digitized image data by adapting image adaptation models. Said method involves: furnishing a hierarchical structure graph with nodes, each representing at least one parametrized image adaptation model; a predetermined number of superimposed planes, wherein at least one node is located on each plane; and edges connecting predetermined pairs of nodes of different planes and defining, for each node pair, a father node as the node in the lower plane and a son node as the node in the upper plane; and applying the structure graph to the image data by processing at least one node beginning with the lowest plane, wherein processing of a node involves the following steps: adapting its at least one image adaptation model to the image data by varying model parameters; determining a degree of adaptation for every parameter variation as a measure of the quality of image adaptation; and determining an evaluation for each parameter variation taking into account the at least one degree of adaptation already determined, wherein the evaluation determined for each parameter variation is used as a criterion for the processing of a son node of the processed node. If this criterion is met, processing of the son node is started with the initialization of the at least one adaptation model of said son node through predetermined parameters of the father node.

FIELD OF THE INVENTION

The invention relates to a method of processing pictures by matching picture matching models, and in particular to a method which is suitable for recognising pictures, i.e. recognising individual objects in a picture, for the analysis of scenes, in particular for the recognition and assignment of objects, for the control of system components, e.g. avatars, and for picture compression and decompression.

Here, the expression “picture recognition” is taken to have a wide-ranging meaning. In particular, the term “picture recognition” should include the identification of an object in a picture through comparison with reference pictures and the classification of objects present in a picture.

STATE OF THE ART

In digital picture processing a range of methods exists which enable the recognition of individual objects in pictures. One example is the so-called “template matching” method, which looks for an object using a simple copy of an image of the object.

Another method known from the prior art is the so-called “graph matching” method, which is described in the German patent specification DE 4406020.

A method of automated recognition of one or more structures in digitised picture data is also described in the publication DE 19837004.

A disadvantage of the known methods is that simple matching methods, which can be carried out with a comparatively moderate amount of computation, are less flexible and soon reach their limits, whereas more powerful and more flexible methods are associated with a very high amount of computation.

DESCRIPTION OF THE INVENTION

Accordingly, the object of this invention is to provide a more efficient method for picture processing which can be used flexibly with simply structured pictures as well as with complex picture scenes.

This object is achieved by the method described in Claim 1.

A method of processing digitised picture data by matching picture matching models is provided, whereby the method comprises the following steps: (i) provision of a hierarchical structure graph with nodes, each of which represents at least one parameterised picture matching model, a specified number of levels arranged above each other, whereby at least one node is present in each level, and edges which link pairs of predetermined nodes of different levels and which, for each pair of nodes, define a father node as the node in the lower level and a son node as the node in the upper level; (ii) application of the structure graph to the picture data, in which, starting with the lowermost level, at least one node is processed, whereby the processing of a node includes the steps: matching of its at least one picture matching model to the picture data by variation of the model parameters, determination of a degree of matching for each parameter variation as a measure of the quality of the picture matching, and determination of an assessment for each parameter variation, taking into consideration the at least one determined degree of matching, and whereby the assessment determined for each parameter variation is used as the criterion for the processing of a son node of the processed node and, if the criterion is fulfilled, the processing of the son node starts with the initialisation of its at least one picture matching model through predetermined parameters of the father node.

A structure graph therefore consists of a set of nodes, which represent picture matching models and their associated picture matching methods, and of edges, each of which defines a pair of father/son nodes and thereby a processing sequence for that pair.

A significant advantage of the method according to the invention is that, through the application of a suitably provided structure graph, it enables an efficient combination of highly different picture matching models and associated methods for the processing of digitised picture data. This means that simple picture matching models may be used on the lower levels of the structure graph, enabling initial conclusions to be drawn about the position of objects in the pictures corresponding to the picture data. To this end, methods may be applied which require a comparatively low amount of computation, such as, for example, simple differential imaging methods. Other methods are, for example, based on the assessment of the shape and geometry of objects and their relationship to one another, on colour classifications, template matching, Hough transformations, methods of segmentation (e.g. region-growing approaches), the use of stereo information (e.g. disparity estimation), the extraction and description of textures, or the application of neural methods for the classification of picture regions.

The matching of a picture matching model occurs through the variation of the parameters of the picture matching model. These parameters include, for example, translation (position), scaling and rotation (orientation) in the picture plane, but also local changes in the models.

A rule is also assigned to each picture matching model, with which a measure of the quality of the matching of the picture matching model to the picture data to be processed may be determined.

It must be noted that the father/son relationship serves only to clarify the processing sequence and is generally not unique, because not only may each father node possess a number of son nodes, but each son node may also possess a number of father nodes in the structure graph.
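The structure graph described above can be pictured as a small data structure. The following Python sketch is only an illustration under stated assumptions: the class `Node`, the helpers `add_edge` and `paths_to`, and the node names are hypothetical and do not appear in the specification. It shows levels, father/son edges, and a son node with more than one father.

```python
from dataclasses import dataclass, field


@dataclass(eq=False, repr=False)
class Node:
    """One node of a structure graph (all names here are illustrative)."""
    name: str
    level: int                                   # 0 is the lowermost level
    models: list = field(default_factory=list)   # picture matching models
    sons: list = field(default_factory=list)     # linked nodes on higher levels
    fathers: list = field(default_factory=list)  # linked nodes on lower levels


def add_edge(father: Node, son: Node) -> None:
    """Link two nodes of different levels; the father is the lower one."""
    if father.level >= son.level:
        raise ValueError("father must lie on a lower level than its son")
    father.sons.append(son)
    son.fathers.append(father)


def paths_to(node: Node) -> list:
    """All father/son sequences that start at a fatherless node and end at `node`."""
    if not node.fathers:
        return [[node.name]]
    return [path + [node.name] for f in node.fathers for path in paths_to(f)]


# A tiny graph in which the son node "face" has two father nodes.
blob = Node("blob", level=0)
outline = Node("outline", level=0)
face = Node("face", level=1)
add_edge(blob, face)
add_edge(outline, face)
print(paths_to(face))  # [['blob', 'face'], ['outline', 'face']]
```

Storing both `sons` and `fathers` keeps the many-to-many relationship explicit, so that a node can be reached along several paths.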

In a development of the method according to the invention which is particularly preferred for the recognition of objects, a structure graph is provided which includes exactly one node in each level, whereby the node represents at least one picture matching model and a lower threshold value and/or an upper threshold value is assigned to specified nodes. In this development the method terminates with the result that no object of the specified object class is recognised if, for a node, the assessment for each parameter variation lies below the lower threshold value assigned to the node, or with the result that at least one object of the specified object class is recognised if, for a node, the assessment of at least one parameter variation lies above the upper threshold value assigned to the node or if the end node is reached.

This development may be applied particularly advantageously in the identification of persons. In comparison to conventional methods it may be operated significantly faster, because the picture data are evaluated more efficiently: the evaluation starts on a coarse scale on the lowermost level of the structure graph, and the parameters found are used for the evaluation of details on higher hierarchical levels, e.g. as initial values for the relevant matching processes.

In another development of the method according to the invention the structure graph exhibits, for various orientations and/or arrangements of elements of an object, at least one node with at least one picture matching model for the orientation to be recognised, and specified nodes exhibit an upper and/or lower threshold value for the assessment of the parameter variations. The method is carried out such that the processing of son nodes is waived for those processed nodes for which the assessment of each parameter variation lies below the lower threshold value assigned to the relevant node, with the result that the corresponding orientations and/or arrangements of the elements of the object are not present in the picture; that son nodes are processed of those processed nodes for which the assessment for at least one parameter variation lies between the upper and lower threshold values assigned to the relevant node; and that the processing of son nodes is waived for the parameter variations whose assessment lies above the upper threshold value assigned to the relevant node, with the result that the orientation and/or arrangement of the elements of the object which receives the best assessment on the highest fully processed level is classified as present in the picture.

This further development is especially advantageous in the recognition of the orientation of an object of an object class in a picture. In particular, it may be employed advantageously in estimating poses and differentiating between different poses.

An alternative advantageous further development of the method according to the invention enables the recognition of objects of different object classes and of the arrangement of objects in a picture and may consequently be advantageously employed in scene analysis. Here, the structure graph comprises, for each object class, at least one node with at least one picture matching model for the object class, whereby specified nodes exhibit a lower and/or upper threshold value for the assessment of the parameter variations. In this further development the processing of son nodes is waived for those nodes for which the assessment of each parameter variation lies below the lower threshold value defined for the relevant node, with the result that the associated object is classified as not being present in the picture; son nodes are processed of those processed nodes for which the assessment lies between the lower and upper threshold values for the relevant node for at least one parameter variation; and the processing of son nodes is waived for the parameter variations whose assessment lies above the upper threshold value assigned to the relevant node, with the result that the associated object is classified as being present in the picture.
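The three-way decision with a lower and an upper threshold can be sketched as follows. This is a simplified illustration, not the claimed procedure: it decides per node on the best assessment only, whereas the text distinguishes individual parameter variations, and it uses precomputed assessments in place of a real matching step. The nested-dict layout, the function names `classify` and `analyse`, and all numbers are hypothetical.

```python
def classify(assessments, lower, upper):
    """Decide what to do after a node has been processed.

    assessments: one assessment per parameter variation (higher is better).
    Returns "absent", "present" or "descend".
    """
    best = max(assessments)
    if best < lower:
        return "absent"    # every variation lies below the lower threshold
    if best > upper:
        return "present"   # at least one variation exceeds the upper threshold
    return "descend"       # undecided: the son nodes have to be processed


def analyse(node, results=None):
    """Walk a structure graph given as nested dicts (hypothetical layout)."""
    if results is None:
        results = {}
    decision = classify(node["assessments"], node["lower"], node["upper"])
    results[node["name"]] = decision
    if decision == "descend":
        for son in node.get("sons", []):
            analyse(son, results)
    return results


scene = {
    "name": "coarse shape", "assessments": [0.4, 0.6], "lower": 0.3, "upper": 0.9,
    "sons": [
        {"name": "car", "assessments": [0.1, 0.2], "lower": 0.3, "upper": 0.8},
        {"name": "person", "assessments": [0.5, 0.95], "lower": 0.3, "upper": 0.8},
    ],
}
print(analyse(scene))
# {'coarse shape': 'descend', 'car': 'absent', 'person': 'present'}
```

Sons are visited only for an undecided node, which is where the saving in computation comes from.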

A structure graph suitable for a scene analysis is generally a complex formation, in which there are typically a number of paths to one end node (i.e. a node on the uppermost level of the structure graph), whereby the term “path” designates a sequence of nodes having a father/son relationship.

This means that there are nodes in the graph which possess more than one father node. This enables single objects to be reused as parts of other complex objects. Consequently, this not only produces an economical representation of knowledge about objects in structure graphs; the evaluation process also profits from it, because the same object parts are no longer in competition with one another in different contexts.

In order to generate a description of a complex scene, a structure graph is used whose paths terminate at end nodes whose picture matching models represent different types of objects. The lowermost layers in the structure graph contain picture matching models which differentiate between the objects according to size, orientation and coarse structure. In the following layers the objects are subdivided into different classes, so that at the end nodes they are finally differentiated according to all or a large part of their definite features.

The evaluation process first processes the picture matching models of the lowermost layer using the methods assigned to them. Here, the rough positions of the objects are determined. With the decision for a parameter variation of the picture matching models of a node, the method initially favours objects of the corresponding object class with the features defined by the parameter variation. Further evaluation occurs in the next stage with the processing of the picture matching models of the son nodes of the processed node with the best assessment for a parameter variation, provided the assessment lies above a specified lower and below a specified upper threshold for the node. In this processing stage part of the parameters is used to define the initial values, in particular the position, for the picture matching models and to restrict the possible variations of the parameters of these picture matching models.

The result of the evaluation is a set of parameter variations which belong to objects recognised in the picture. Here, the parameters determine the position and other properties of the picture matching models, such as, for example, the size.

In a further, particularly advantageous development of the method according to the invention a lower threshold value is assigned to each node of the structure graph and the processing of son nodes of a node is waived if each parameter variation produces an assessment below the threshold value defined for the node.

This means that matching stages which look unpromising are terminated early, leading to faster execution of the method. If the models employed permit it, the degrees of matching and the assessments are defined such that the assessments of different nodes may be directly compared with one another. In this case a universal threshold value may be specified for all nodes in the last-mentioned further development.

In a preferred further development of the method according to the invention the lower and/or the upper threshold values may be adapted dynamically.

Consequently, the thresholds may be modified, for example on the basis of results for a preceding picture in a sequence, to express a certain expectation regarding the chronological course of the picture sequence, in particular the renewed recognition, in the next picture, of an object already detected.

Particularly advantageously, parameters of the processed father nodes may be accepted at least partially in the method according to the invention for the processing of son nodes.

For example, on a lower level of the structure graph the outline of an object may be determined, whereupon the corresponding parameters are accepted into the models of the son nodes. The term “accepted” may signify that the parameter values directly enter a matching model without being newly released there for variation, but also that they act as initial values of the matching model or that the initial values are calculated from them. The parameters of the father nodes may also be advantageously applied in that they define limits within which appropriate parameters of the picture matching models of the son nodes may be varied. In all of these cases the amount of computation is substantially reduced and, of course, most significantly in the first case.
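The three ways of accepting father parameters can be illustrated with a small sketch. The function `init_son`, the mode names and the parameter names are hypothetical; the symmetric window used for the bounded case is an assumption, since the text only states that the father parameters define limits.

```python
def init_son(father_params, mode, margin=0.0):
    """Derive a son model's parameter setup from its father's result.

    Returns a dict: name -> (initial value, lowest allowed, highest allowed).
    """
    setup = {}
    for name, value in father_params.items():
        if mode == "fixed":        # taken over, not released for variation
            setup[name] = (value, value, value)
        elif mode == "initial":    # only a starting value, range unrestricted
            setup[name] = (value, float("-inf"), float("inf"))
        elif mode == "bounded":    # variation limited to a window (assumed symmetric)
            setup[name] = (value, value - margin, value + margin)
        else:
            raise ValueError(f"unknown mode: {mode}")
    return setup


father = {"x": 120.0, "y": 80.0, "scale": 1.5}
print(init_son(father, "bounded", margin=2.0)["x"])  # (120.0, 118.0, 122.0)
print(init_son(father, "fixed")["scale"])            # (1.5, 1.5, 1.5)
```

In the fixed mode the son has nothing left to search for that parameter, which matches the remark that the first case saves the most computation.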

In an advantageous further development the corresponding assessment of the parameter variations for picture matching models of the father node is taken into account in the assessment of the parameter variations for picture matching models of son nodes. This is of particular advantage when, at the highest processed level, different matching processes lead to closely adjacent assessments, involving the risk that, on continuing, the method runs to a sub-optimum solution which does not correspond to the best possible result.

In a further development of the method according to the invention weighting values, which enter into the assessment, are assigned to each picture matching model and/or each edge.

The weighting of the picture matching models and/or of the edges may be practicable when a certain expectation of a picture to be processed is present, and/or when a node encompasses several picture matching models which have different informatory values. When a node exhibits a number of son nodes, the weighting of the edges may also be applied to specify a processing sequence of the son nodes, which, with a suitable selection of threshold values (interruption criteria), may help to avoid a superfluous amount of computation. Also, combinations of different nodes which offer more information or are more plausible may reasonably be taken into account, because these weighting factors may act as a predetermined measure of which combination of picture matching models of nodes on different levels is attributed a particularly high level of informatory value.
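How edge weights might fix a processing sequence and combine with an interruption criterion can be sketched as follows. This is only an illustration: the function `process_sons`, the multiplication of edge weight and assessment, the node names and all numbers are assumptions that the specification does not prescribe.

```python
def process_sons(sons, evaluate, stop_above):
    """Process son nodes in order of decreasing edge weight.

    sons: list of (edge_weight, name) pairs.
    evaluate: callable returning the best assessment of a son node.
    Returns (names actually evaluated, winning name or None).
    """
    evaluated = []
    for weight, name in sorted(sons, key=lambda s: s[0], reverse=True):
        evaluated.append(name)
        # Assumption: the edge weight enters the assessment as a factor.
        if weight * evaluate(name) > stop_above:
            return evaluated, name      # interruption criterion met
    return evaluated, None


scores = {"frontal": 0.9, "half-profile": 0.6, "profile": 0.3}
sons = [(0.5, "profile"), (1.0, "frontal"), (0.75, "half-profile")]
print(process_sons(sons, scores.get, stop_above=0.8))
# (['frontal'], 'frontal')
```

Because the most heavily weighted son already satisfies the criterion here, the other two sons are never matched.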

In a particularly preferred further development of all previously mentioned methods, the picture matching models of the nodes of the uppermost level are based on digitised reference picture data and/or the picture matching models of predetermined nodes are based on the picture matching models of their son nodes.

This means that a hierarchy in the complexity of the picture matching models may be achieved in a simple manner, whereby for example the picture matching models of the end nodes correspond to detailed portraits, whereas picture matching models are built up more simply, i.e. for example they exhibit fewer parameters, the lower the level is located. Such simplified picture matching models may be quickly adapted, whereby a quick and effective evaluation is ensured in the lower levels, which in turn has a positive effect on the overall efficiency of the method.

With particular preference, the picture matching models of the nodes of the structure graph encompass graphs of features, so-called model graphs, which consist of a two-dimensional arrangement of model graph nodes and edges. Here, features which contain information about picture data are assigned to the model graph nodes. The model graph edges code the relative arrangement of the model graph nodes.

Particularly preferred model graphs are so-called reference graphs, the features of which are the results of the application of a set of filters, e.g. Gabor filters, to the digitised data of comparative pictures. Here, the features may be the results of the application of a set of filters to picture data which itself may originate from different comparative pictures.

The set of filters may be obtained using scaling and rotation from an original filter. The scaling of the filters from which the features are obtained preferably becomes smaller from hierarchical stage to hierarchical stage, whereby the frequency increases or the resolution becomes more refined. These types of feature are termed jets.
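A filter family obtained from one original filter by scaling and rotation, and the jet it yields at a picture position, can be sketched in plain Python. The concrete choices are assumptions and not taken from the specification: the number of scales and orientations, the highest frequency `k_max`, the scale factor, a Gaussian envelope whose width is tied to the wavelength, and the function names `wave_vectors`, `gabor` and `jet`.

```python
import cmath
import math


def wave_vectors(levels=3, orientations=4, k_max=math.pi / 2, factor=math.sqrt(2)):
    """Centre frequencies of a filter family derived from one original filter."""
    vectors = []
    for level in range(levels):                 # scaling
        k = k_max / factor ** level
        for o in range(orientations):           # rotation
            phi = o * math.pi / orientations
            vectors.append((k * math.cos(phi), k * math.sin(phi)))
    return vectors


def gabor(dx, dy, kx, ky):
    """Complex Gabor kernel value: Gaussian envelope times a plane wave."""
    sigma = math.pi / math.hypot(kx, ky)        # assumed envelope width
    envelope = math.exp(-(dx * dx + dy * dy) / (2.0 * sigma * sigma))
    return envelope * cmath.exp(1j * (kx * dx + ky * dy))


def jet(image, cx, cy, vectors, radius=4):
    """Filter responses at (cx, cy) as a list of (amplitude, phase) pairs."""
    rows, cols = len(image), len(image[0])
    responses = []
    for kx, ky in vectors:
        total = 0j
        for dy in range(-radius, radius + 1):
            for dx in range(-radius, radius + 1):
                x, y = cx + dx, cy + dy
                if 0 <= x < cols and 0 <= y < rows:   # ignore pixels outside the picture
                    total += image[y][x] * gabor(dx, dy, kx, ky)
        responses.append((abs(total), cmath.phase(total)))
    return responses


# A 16 x 16 test picture with vertical stripes.
image = [[1.0 if (x // 2) % 2 == 0 else 0.0 for x in range(16)] for _ in range(16)]
print(len(jet(image, 8, 8, wave_vectors())))  # 12
```

Each jet component corresponds to one scale and one orientation, so a node on a coarse hierarchical level could use only the low-frequency components of such a family.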

In the matching of the picture matching model graphs the similarity of the jets of each picture matching model graph node $k$ to the corresponding jets of the current picture is calculated. In addition, a number of features $m$, which may be generated from different pictures or from a previous learning process for the picture matching model features, may be assigned to each picture matching model graph node. For each picture matching model graph node the similarity of its jet to the jet from the current picture is calculated for a certain position. In this respect, the jets of each reference graph $j(k,m)$ are compared to the jets from the current picture $\tilde{j}(k)$.

In the most general case a descriptive measure of matching for the parameter variations of the picture matching models is produced for the overall similarity between the picture matching model graph and the picture according to the formula:

$S^{(n)} = f^{(n)}\left( P_k^{(n)} j(k,m),\; P_k^{(n)} \tilde{j}(k),\; d^{(0)}(k,m),\; d^{(n)}(k,m) \right), \quad k \in K'(n) \subset K,$

whereby

-   $j(k,m)$ is the jet of the picture matching model $m$ on the node $k$,
-   $\tilde{j}(k)$ is the jet of the picture at the position of the node $k$,
-   $d^{(0)}(k,m)$ is the original position of the node $k$ of the picture matching model $m$,
-   $d^{(n)}(k,m)$ is the position of the node $k$ of the picture matching model $m$ in step $n$,
-   $f^{(n)}(\ldots)$ is a functional of the picture matching model jet and the picture jet at corresponding locations,
-   $P_k^{(n)}$ represents a mapping of the jet $j(k,m)$ or $\tilde{j}(k)$, and
-   $K'(n)$ is a subset of the set $K$ of all graph nodes $k$.

The original position and its change are used for the computation of the topological costs incurred by too strong a deformation of the picture matching model. The step parameter n here indicates that the computation of the overall similarity varies both during the individual phases of the matching process for a picture matching model and also from picture matching model to picture matching model. In particular, the matching process is subdivided into a number of phases, such as, for example, the coarse positioning, rescaling and fine matching of the picture matching model. In each phase the overall similarity is calculated in an adequate manner.

In a preferred form the similarity of the graph is chosen in step n as:

$S^{(n)} = f_2^{(n)}\left( k,\; f_1^{(n)}\left( n,\; s\left( P_k^{(n)} j(k,m),\; P_k^{(n)} \tilde{j}(k) \right),\; d^{(0)}(k,m),\; d^{(n)}(k,m) \right) \right)$

Here $f_1^{(n)}\left( n,\; s\left( P_k^{(n)} j(k,m),\; P_k^{(n)} \tilde{j}(k) \right),\; d^{(0)}(k,m),\; d^{(n)}(k,m) \right)$ is the similarity of the picture jet to the picture matching model jets or the submodel jets for step n.

Via $f_2^{(n)}(k, \ldots)$ an overall measure is obtained over all nodes. In this connection topological costs may in particular be taken into account. These form a measure of the local deformation of the model graph which arises during the matching through the displacement of the model graph nodes with respect to one another.

In further preferred forms a summation, weighted summation or mean formation is chosen both for $f_1$ and for $f_2$ (with $f_1$ as the sum over the picture matching models or picture matching model subjets, and $f_2$ as the sum over the nodes).

In other constellations it has also proven to be advantageous to use an ordering operation such as the median or the “trimmed mean” for $f_1$ or $f_2$.
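The effect of an ordering operation compared with a plain mean can be seen in a short sketch. The node similarities are invented example values, and the helper `trimmed_mean` with its `cut` parameter is a hypothetical implementation; the specification does not state how much is trimmed.

```python
import statistics


def trimmed_mean(values, cut=0.2):
    """Mean after dropping the given fraction of values at each end."""
    if not 0.0 <= cut < 0.5:
        raise ValueError("cut must lie in [0, 0.5)")
    ordered = sorted(values)
    drop = int(len(ordered) * cut)
    kept = ordered[drop:len(ordered) - drop]
    return sum(kept) / len(kept)


# Similarities of five graph nodes; one node is assumed to be occluded.
node_sims = [0.9, 0.8, 0.85, 0.1, 0.95]
print(statistics.mean(node_sims))    # pulled down by the single outlier
print(statistics.median(node_sims))  # 0.85
print(trimmed_mean(node_sims))       # outlier and best value dropped
```

The median and the trimmed mean stay close to the similarity of the well-matched nodes, whereas the plain mean is dominated by the one poor node.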

In a particularly preferred form the jet similarity of the graph may be chosen in step n as:

$S^{(n)} = \sum_{k \in K'(n)} l_m^{th}(n)\, s\left( P_k^{(n)} j(k,m),\; P_k^{(n)} \tilde{j}(k) \right) + f^{(n)}\left( d^{(0)}(k,m),\; d^{(n)}(k,m) \right)$

Here $l_m^{th}(n)$ designates an ordering operation on the $m$ jet similarities $s\left( P_k^{(n)} j(k,m),\; P_k^{(n)} \tilde{j}(k) \right)$, e.g. with $l = 1$ the maximum of the $m$ similarities.

The changes in the topology of the picture matching models are taken into account with $f^{(n)}\left( d^{(0)}(k,m),\; d^{(n)}(k,m) \right)$.
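A numerical sketch of this form of the graph similarity is given below. It rests on several assumptions that are not in the specification: node positions are kept per node rather than per node and model, the topology term is a quadratic penalty on the change of inter-node offsets scaled by a hypothetical `stiffness`, and the function name `graph_similarity` is invented.

```python
def graph_similarity(jet_sims, d0, dn, l=1, stiffness=0.0):
    """Sum over nodes of the l-th largest jet similarity, minus a deformation cost.

    jet_sims[k]: the m jet similarities of node k.
    d0[k], dn[k]: original and current (x, y) position of node k.
    """
    feature_term = sum(sorted(sims, reverse=True)[l - 1] for sims in jet_sims)
    cost = 0.0
    for a in range(len(d0)):
        for b in range(a + 1, len(d0)):
            before = (d0[b][0] - d0[a][0], d0[b][1] - d0[a][1])
            after = (dn[b][0] - dn[a][0], dn[b][1] - dn[a][1])
            # Assumed topological cost: squared change of the offset a -> b.
            cost += (after[0] - before[0]) ** 2 + (after[1] - before[1]) ** 2
    return feature_term - stiffness * cost


sims = [[0.25, 1.0, 0.5], [0.75, 0.125]]
d0 = [(0, 0), (10, 0)]
print(graph_similarity(sims, d0, d0))                                # 1.75
print(graph_similarity(sims, d0, d0, l=2))                           # 0.625
print(graph_similarity(sims, d0, [(1, 1), (13, 1)], stiffness=0.125))  # 1.25
```

A rigid shift of the whole graph leaves the offsets unchanged and therefore costs nothing; only the stretching from 10 to 12 pixels in the last call is penalised.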

In a further form $P^{(n)}$ is a function of jets which transforms the vector $j$ into a vector $j'$, whereby the components of $j'$ are a subset of the components of $j$. This function may, however, vary over the nodes. This is particularly of benefit if the approximate position of the model on the picture region is to be found through the selection of the low frequency portions on a node, while at the same time high frequency portions are used on other significant nodes to increase the sensitivity of the localisation and the detection resolution.

If the jets are represented in the form of amplitudes ($a_i$) and phases ($p_i$), then the feature similarity $s(j, \tilde{j})$ may be calculated in preferred embodiments according to one of the following formulas:

$s(j, \tilde{j}) = \frac{\sum_i a_i(m)\, \tilde{a}_i}{\sqrt{\sum_i a_i(m)^2 \; \sum_i \tilde{a}_i^2}}$

$s(j, \tilde{j}) = \frac{\sum_i a_i(m)\, \tilde{a}_i \cos\left( p_i(m) - \tilde{p}_i \right)}{\sqrt{\sum_i a_i(m)^2 \; \sum_i \tilde{a}_i^2}}$

$s(j, \tilde{j}) = \frac{\sum_i a_i(m)\, \tilde{a}_i \cos\left( p_i(m) - \tilde{p}_i - d \cdot k_i \right)}{\sqrt{\sum_i a_i(m)^2 \; \sum_i \tilde{a}_i^2}}$

In the last formula $d$ designates the disparity between the model and picture jets. To determine $s$, $d$ is varied such that $s$ is a maximum. Here $k_i$ designates the position of the i-th filter in Fourier space.
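The three similarity measures can be tried out with a few lines of Python. Jets are represented here as lists of (amplitude, phase) pairs, which is an assumed data layout; the function names are hypothetical, and the search over a list of candidate disparities stands in for whatever optimisation is actually used, since the text only says that d is varied until s is a maximum.

```python
import math


def _norm(model, picture):
    """Square root of the product of the two amplitude energies."""
    return math.sqrt(sum(a * a for a, _ in model) * sum(b * b for b, _ in picture))


def sim_amplitude(model, picture):
    """First formula: normalised correlation of the amplitudes."""
    norm = _norm(model, picture)
    if norm == 0.0:
        return 0.0
    return sum(a * b for (a, _), (b, _) in zip(model, picture)) / norm


def sim_phase(model, picture, d=(0.0, 0.0), k=None):
    """Second formula; with filter positions `k` given, the third formula."""
    norm = _norm(model, picture)
    if norm == 0.0:
        return 0.0
    total = 0.0
    for i, ((a, p), (b, q)) in enumerate(zip(model, picture)):
        shift = d[0] * k[i][0] + d[1] * k[i][1] if k is not None else 0.0
        total += a * b * math.cos(p - q - shift)
    return total / norm


def best_disparity(model, picture, k, candidates):
    """Pick the candidate disparity that maximises the third formula."""
    return max(candidates, key=lambda d: sim_phase(model, picture, d, k))


k = [(1.0, 0.0), (0.0, 1.0)]              # assumed filter positions in Fourier space
model = [(1.0, 0.3), (2.0, 1.0)]
picture = [(1.0, 0.3 - 0.5), (2.0, 1.0 + 0.25)]   # phases shifted by d = (0.5, -0.25)
print(best_disparity(model, picture, k, [(0.0, 0.0), (0.5, -0.25), (1.0, 1.0)]))
# (0.5, -0.25)
```

The amplitude-only measure varies slowly with position, while the phase-sensitive measures react to small displacements, which is why the disparity term is useful for fine localisation.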

In a further preferred form, the vectors on the nodes may represent not just jets in the conventional sense, which are based on a Gabor transformation, but also so-called compound jets, the components of which also represent non-homogeneous representations of a region, e.g. edge information in one component and similarity to a colour in another.

Furthermore, the combination of a number of these methods (i.e. of the picture matching models associated with the methods) within a node of the structure graph is practicable, because combining the results of the individual methods in the computation of an overall assessment (i.e. quality of matching) for the node permits a more exact assessment of the picture data used as a basis than would be the case with each method taken alone.

A special form of the method described here is bunch graph matching, which may be represented as:

$S = \sum_{k \in K} \max_m \; s\left( j(k,m),\; \tilde{j}(k) \right)$

In a possible embodiment the picture matching models on the nodes each possess a set of the same features, so-called bundled jets. Such a bundle generally represents a special aspect of an object as a set of features obtained from a set of individual characteristics. If such a picture matching model represents a face, then the right eye, left eye, mouth and nose are special aspects. They are each represented by a set of features obtained from the corresponding aspects of a set of comparative pictures, each of which shows the face of another person with possibly another facial expression.

The number of features needed to cover a representative part of the different characteristics, and therefore to obtain a sufficiently general representation of the aspect, varies. On the one hand it varies from node to node of the picture matching model, because the object generally has simple aspects which are very similar for all individuals, and at the same time has other, more complex aspects which differ significantly from individual to individual.

On the other hand, with the application of filters of different sizes a dependence on the resolution is added. Consequently, a few representatives are generally sufficient for the features obtained from the coarse filters, whereas the same aspects vary more significantly for the features of the finely resolving filters, so that more features of different individuals are needed in order to achieve the same general validity of the representation than is the case for the coarsely resolving filters and their features. The number of features in the bundles of the picture matching model nodes reduces accordingly when the resolution of the filters is reduced from the end nodes towards the start node.

The classical representation with only one picture matching model cannot profit from this fact, because the features include all filters from coarse to fine resolution. Preferably the structure graph is set up such that the features include combinations of different types of jets. With regard to the evaluation of the structure graph, it is practicable if the individuals from which the features are obtained and which contribute to the optimum matching of a picture matching model on a level change from level to level.

This enables a very compact representation through the structure graph, because with classical graph matching, features are always needed which contain all the applied filters with their various resolutions. In this respect a bundle would need in each case to contain an appropriate feature for all possible combinations of the features for the individual resolutions in order to achieve the same generality as the described representation via the structure graph. Apart from the management of the very large number of features which are needed according to the laws of combinatorics, the procurement of the data for all these individuals renders this approach very cumbersome, if not unusable.

In a further development the described method is very suitable for picture compression and decompression.

Picture compression comprises the steps: compression of each recognised object with a given compression factor for the appropriate object class, whereby the control of the parameters of the compression method is based on the parameters from the results of the scene analysis, and compression of the picture region (background) not occupied by objects using a higher given compression factor.

During the compression the picture is segmented into single objects according to the above described scene analysis. These objects have here already been broken down into their constituent parts by the picture matching models along the associated path in the structure graph. The information from the segmentation and the breaking down of the objects may be used for the compression in many ways:

-   Through the control of the parameters of a conventional compression method, it is ensured that the “interesting” objects may be reproduced with good quality after the decompression, whereas the “uninteresting” regions of the picture are more substantially compressed, whereby the losses in quality resulting from this are not a disturbance in the reproduction.
-   If the structure graph is available during the decompression, then an identifier for the path in the structure graph which has been found for an object may be transferred. During the decompression a type of phantom object may then be generated based on the knowledge via the path in the structure graph. In addition, the information may be coded which enables the phantom object to adapt to the actual object. In this way only a very compact code for the object class (the path in the structure graph) and the information about the variation of the actual object compared to its class representative need to be transferred.
-   Instead of retaining the complete structure graph for the decompression, the relevant part may be initially coded. During the compression of picture sequences, the part of the structure graph needed for an object only needs to be transferred once, and then the appropriate part may be coded again by a short identifier.

The decompression of a picture compressed in this way occurs by the reverse of the procedure used for the compression.

As already described under picture compression/decompression, the information obtained about the picture content may also be used in order to replace the actual objects in the picture by the representatives of their object class.

The preferred development of the appropriate method includes the steps: provision of reference pictures of the object representatives, and substitution of the at least one selected recognised object by the object representative, whereby part of the parameters of the picture matching models may be applied to the control of the object representatives.

This technique may also be applied to use any placeholders, so-called avatars, instead of the representatives. These avatars may be controlled through the processing of picture sequences and the tracking of objects and of their intrinsic movements which this provides. This technique is used in video telephony, 3D internet chat, in trick-film techniques and in the control of virtual figures of interactive software, such as, for example, a game or a virtual museum.

Similarly, advantageous further developments of the method according tothe invention may be used to replace the background in a picture by adifferent background in that at least one reference picture is providedfor the other background and then at least one object recognised by themethod according to the invention is inserted into the referencepicture.

Here, combinations of the latter mentioned methods are also possible sothat—starting from a real picture with object(s) and background—anartificial picture is created, whereby at least one of these objects isreplaced by an object representative and the background by a differentbackground.

In order to be able to visualise such pictures more easily without ineach case having to process, save or transfer the entire information, apreferred further development of the method according to the inventionmakes available a data base with the object representatives and/or thereference pictures for the background. In particular for the transfer ofsuch processed pictures, it proves useful if the data base is also madeavailable on the receiver side.

In particular, a further development is suitable for all types oftrick-film techniques, in which the object representatives include realobjects and/or virtual objects.

This enables almost any desired scenes to be composed.

A particularly advantageous further development of the method according to the invention may be used for processing the individual pictures of a picture sequence. To do this, the parameters of the picture matching models are allocated initial values which use part of the parameters from the processing of previous pictures.

Compared with processing the individual pictures independently of one another, a substantial speeding up of the processing procedure may be achieved in this way.

Preferably, the possible variations of the parameters of the picture matching models are restricted based on a part of the parameters from the processing of previous pictures, because in this way the number of computation operations to be carried out may be further reduced. This is mainly practicable when chronologically sequential pictures of a scene which changes relatively little are to be processed.
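The restriction of the parameter variations for picture sequences may be illustrated by the following minimal sketch. It is not taken from the embodiments: the function names `restricted_positions` and `match_sequence`, the square search window, the `radius` parameter and the callable `score` (standing in for a matching quantity) are assumptions made purely for illustration.

```python
def restricted_positions(prev_pos, radius, width, height):
    """Candidate model positions for the current picture.

    Only positions within `radius` pixels (a hypothetical bound) of the
    position found in the previous picture are varied; the window is
    clipped to the picture borders.
    """
    px, py = prev_pos
    xs = range(max(0, px - radius), min(width, px + radius + 1))
    ys = range(max(0, py - radius), min(height, py + radius + 1))
    return [(x, y) for y in ys for x in xs]


def match_sequence(score, num_frames, first_pos, radius, width, height):
    """Track a model position through a picture sequence.

    `score(t, pos)` is a hypothetical matching quantity for picture t.
    Each picture is initialised with the position of the previous one.
    """
    pos = first_pos
    track = [pos]
    for t in range(1, num_frames):
        candidates = restricted_positions(pos, radius, width, height)
        pos = max(candidates, key=lambda p: score(t, p))
        track.append(pos)
    return track
```

With a radius of 2, at most 25 positions are evaluated per picture instead of every picture point, which is only reasonable under the stated condition that the scene changes little from picture to picture.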

In the following, particular embodiments of the invention are explained with reference to the accompanying figures, in which:

FIG. 1: shows a preferred method of object recognition,

FIG. 2: shows a preferred method of estimating a pose,

FIGS. 3A/3B: show picture matching models used for a picture analysis on different hierarchical levels,

FIGS. 4A/4B: show a picture analysis carried out with the picture matching models illustrated in FIGS. 3A/3B.

OBJECT RECOGNITION

In this embodiment the structure graph consists of a chain of nodes, each of which is connected by aligned edges.

The object of the evaluation process here is the efficient selection of suitable parameters for all picture matching models in the structure graph. This includes in particular the position of the object in the picture.

In a particularly effective embodiment the picture matching models of the nodes represent picture information, the degree of detail of which increases from level to level in the direction of the end node. The degree of detail may here vary in the complexity of the picture matching model and/or in the resolution of the representation of the picture data.

In order, for example, to be able to recognise a person in a picture completely with body, head, arms and legs, a structure graph is provided which in the node on the lowermost level exhibits a very simple picture matching model which has just sufficient information about the picture data to acquire the rough alignment of a person as a whole. At the next higher levels the corresponding picture matching model encompasses increasingly more details, and the applied representation of the picture data enables these still relatively coarse structures to be recognised if they are present in the picture to be examined.

The evaluation process starts at the node of the lowest level. Its picture matching model is matched to the picture data, whereby the parameters of the picture matching model, in particular the parameters for the positioning of the model, are varied until the best possible match (or at least a sufficiently good one, i.e. one exceeding a given threshold value) is achieved with the current picture data. The resulting parameter set describes the rough alignment of the person in the picture.

With the transition to the son node of the processed node, the associated picture matching model is preassigned with suitable parameters (parameter initialisation), whereby the parameter values resulting from the variation methods carried out for the processed node are taken into account. This includes particularly the definition of the position and the choice of a suitable variation range for the position.

The matching process refines the position of the object and determines suitable parameters for the additional degrees of freedom which distinguish this picture matching model from the previous one. The same procedure is used with the other nodes on the path to the end node and for the end node itself.

At the end of the evaluation process all nodes of the chain have been processed and the picture matching models assigned to them have been matched to the picture data. Here, the complete information about the object, such as for example the position and type of individual parts, may be distributed over the whole of the picture matching models and their parameter assignments.

FIG. 1 illustrates a preferred embodiment of the method according to the invention for object recognition.

The reference symbols 1, 2, 3 stand for the nodes of the structure graph, which consists of a simple chain of three nodes in this example. The “node” 0 is not part of the actual structure graph, but instead stands symbolically for the initialisation of the method procedure.

Since in this simple embodiment a node and its associated level may be identified with one another, the same reference symbols are used for the node and the associated level.

For reasons of better clarity, the picture matching models and stages corresponding to a node in the drawing level are illustrated in each case above the symbol representing the node.

This means that node 1 is processed in step 0→1 and correspondingly node 2 in step 1→2 and node 3 in step 2→3.

The left partial picture shows here in each case the matching model before the matching process and the right partial picture shows the matching model after the matching process.

The left partial picture of the first level 1 has been initialised with arbitrary parameters. In contrast, the corresponding left partial pictures on levels 2 and 3 have been initialised with the parameters in each case of the right partial picture of the upper adjacent level.

It must be noted that in FIG. 1 (as also in the corresponding following figures) the hierarchical stages increase from the top to the bottom.

The evaluation method begins with node 1: The picture matching model of this node is a very coarse model of faces which essentially represents the outline of the face. The picture information is also represented very coarsely, e.g. with low-frequency filters for only two different directions. During the matching process, this model is evaluated at all possible picture points or at an adequately selected subset, obtained for example by downsampling, before the best parameter variation is applied to the processing of the following levels. Due to the coarse view of the picture, the matching process is carried out on a coarse pitch so that only relatively few possibilities need to be evaluated. The assessment of the individual possibilities also occurs very fast, because the picture information is represented with only a few filters.

Once the matching process on the lowermost level is concluded, the model of node 2 is initialised based on the results of the father node 1. Here, in particular the localisation of the model is accepted, i.e. the parameters of node 1 describing the positioning are taken into account for its son node 2.

The matching process of the matching model of node 2 only operates on a small picture extract and essentially carries out local optimisations of the results of the first level. This optimisation exploits the additional information now available from the refined model and the more accurate representation of the picture through more and better resolved filters.

After the matching of this model the model of node 3 is accordingly initialised. The relatively complicated matching of this model now occurs only within a very restricted search space and is therefore quickly concluded.
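The coarse-to-fine procedure described for the chain of nodes 1, 2, 3 may be sketched as follows. This is a simplified illustration under stated assumptions, not the embodiment itself: positions are one-dimensional, each level is reduced to a hypothetical pair `(pitch, window)`, and the callable `score(level, x)` stands in for the matching of the picture matching model of that level.

```python
def match_chain(levels, width, score):
    """Process a chain of nodes from the coarsest to the finest level.

    levels -- list of (pitch, window) pairs, one per node (hypothetical
              parameters); the window of the first level is ignored
              because the whole picture is scanned there.
    score  -- score(level_index, x) -> float, a stand-in for the
              matching quantity of the model of that level at position x.
    Returns the final position and the number of evaluated positions.
    """
    pos = None
    evaluated = 0
    for i, (pitch, window) in enumerate(levels):
        if pos is None:
            # Lowest level: scan the whole range on a coarse pitch.
            candidates = range(0, width, pitch)
        else:
            # Son node: initialised with the father's position and
            # varied only inside a small window around it.
            lo = max(0, pos - window)
            hi = min(width - 1, pos + window)
            candidates = range(lo, hi + 1, pitch)
        pos = max(candidates, key=lambda x: score(i, x))
        evaluated += len(candidates)
    return pos, evaluated


# Toy example: the object sits at x = 137 in a row of 256 points.
target = 137
levels = [(16, None), (4, 16), (1, 4)]
pos, evaluated = match_chain(levels, 256, lambda i, x: -abs(x - target))
```

In this toy example the position 137 is found after evaluating 34 candidate positions, compared with 256 for an exhaustive scan at full resolution.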

The advantages of the method compared to conventional graph matching and its variants lie in the refined control of the overall matching process provided by the structure graphs.

The situation is avoided where detailed model and picture information must be taken into account for matching the coarse structure; at this stage they would not contribute any relevant gain of information, but would drastically increase the amount of computation. Through application of the structure graph, details are only taken into account (at higher hierarchical levels of the structure graph) when this is practicable.

With conventional (classical) graph matching, the consistent application of detailed picture information and complex models also often leads to suboptimum solutions. This disadvantage can also be avoided by the method according to the invention by applying the structure graphs.

Pose Estimation

FIG. 2 shows an embodiment of the method according to the invention which may in particular be used for pose estimation.

In this embodiment the associated structure graph resembles a branching tree.

For each parameter variation an assessment is computed from the matching quantities of the picture matching models of a node and the weightings assigned to them.
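How the matching quantities and weightings are combined is not specified here. One possible choice, assumed only for illustration, is a weighted mean; the function names `assess` and `best_variation` and the example values below are hypothetical.

```python
def assess(matching_quantities, weights):
    """Weighted mean of the matching quantities of a node's models.

    The weighted mean is an assumption; the description only states
    that the assessment arises from the matching quantities and the
    weightings assigned to the picture matching models.
    """
    if len(matching_quantities) != len(weights):
        raise ValueError("one weight per picture matching model expected")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    return sum(w * s for w, s in zip(weights, matching_quantities)) / total


def best_variation(variations, weights):
    """Return (assessment, parameters) of the best parameter variation.

    variations -- dict mapping a parameter tuple to the list of
                  matching quantities obtained for that variation.
    """
    return max((assess(q, weights), p) for p, q in variations.items())


# Two hypothetical parameter variations (model positions) of one node
# with two picture matching models weighted 3:1.
variations = {(10, 20): [0.8, 0.6], (12, 20): [0.9, 0.2]}
best_score, best_params = best_variation(variations, [3, 1])
```

In this example the variation at position (10, 20) receives the better assessment, because the first model carries three times the weight of the second.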

The evaluation process is continued with the son nodes of the node for which the best assessment of a parameter variation has been attained. The picture matching models of the son nodes of this node are preassigned with suitable parameters. This particularly includes the positioning of the picture matching model in the current picture.

All previously assessed parameter variations are in competition. Therefore the evaluation may be continued at a node different from one of the son nodes of the currently considered node. This always takes place when the assessments for all parameter variations of the currently considered node are worse than a previously assessed parameter variation. This procedure enables the structure graph to be evaluated without having to make a decision prematurely which would then possibly lead to a worse or even incorrect result.

As already seen with the first example (FIG. 1), the “start node” 0 is not an integral constituent part of the structure graph, but rather is only used for the initialisation of the method. Using this node, it may be specified, for example, in which sequence the nodes 1, 2, 3 of the lowermost level of the structure graph are to be processed.

In the structure graph a subdivision into simple structures occurs on the lowermost level 1. With increasing hierarchical stages the structures become more and more complex until they possess the full complexity for the representation of complete objects at the end nodes, the nodes of level 3.

In order to be able to recognise the head of a person in the picture in a wide spectrum of different poses or head postures, a structure graph is formed in which the picture matching models in the lowermost level 1 only represent the coarse head shape, from rounded to elongated, and the coarse orientation in the plane of the picture.

The models of the next (second lowest in the hierarchy) level 2 represent the head shape and a coarse form of the inner structure of faces, such as for example the position of the eye recesses and the mouth region.

Level 3 is the uppermost level of the structure graph and exhibits end nodes, the models of which encompass the complete representation of the faces in various poses.

Matching of the models of the nodes of the lowest level occurs as described in the last example and supplies the assessments given in the figure, whereby in each case only the assessment for the best parameter variation is given. They are the assessment 0.9 for the node 1 processed in step 0→1, the assessment 0.8 for the node 2 processed in step 0→2 and the assessment 0.7 for the node 3 processed in step 0→3.

Based on these assessments the son nodes of node 1, situated on the second level of the structure graph, are now processed first, whereby their picture matching models are initialised based on the results of the matching process for node 1.

After processing the son nodes 4, 5, 6 of node 1, the assessments of 0.7 for node 4, 0.75 for node 5 and 0.6 for node 6 result.

These assessments are however worse than the assessment of the node processed on the first level in step 0→2, the assessment of which is 0.8.

Further evaluation of the paths starting with step 0→1 is therefore initially deferred and the possibly more promising processing of the son nodes of node 2 processed in step 0→2 is carried out first.

To this end, the models of the son nodes are initialised based on the matching of the model of node 2 processed in step 0→2, and then the matching method is again carried out.

For nodes 7, 8, 9 the matching leads to the assessments 0.65, 0.6 and 0.5 given in the figure. Therefore, the best assessment of the son nodes 7, 8, 9 of node 2 lies below the best assessment of a son node of node 1 processed in step 0→1.

Consequently, processing of the son nodes of the best assessed node 5 of the second level now proceeds. These are the nodes 10-12.

The matching of the associated models supplies the best assessment for node 11, which at 0.74 is also higher than the best assessment (0.65) of nodes 7-9 of the hierarchical level 2 situated below.

The evaluation of the structure graph can therefore be concluded.
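The sequence of steps in this example corresponds to a best-first evaluation in which all assessed nodes compete. The following sketch reproduces it with a priority queue. The assessments for nodes 1 to 9 and 11 are those stated above; the values for nodes 10 and 12, the names `SONS`, `ASSESSMENT`, `END_NODES` and `evaluate`, and the termination rule (stop as soon as an end node is the best open candidate) are assumptions made for illustration.

```python
import heapq

# Sons per node; node 0 is the symbolic start node. Sons of nodes that
# are never expanded in this example are omitted.
SONS = {0: [1, 2, 3], 1: [4, 5, 6], 2: [7, 8, 9], 5: [10, 11, 12]}

# Best assessment per node as stated for FIG. 2.
ASSESSMENT = {1: 0.9, 2: 0.8, 3: 0.7,
              4: 0.7, 5: 0.75, 6: 0.6,
              7: 0.65, 8: 0.6, 9: 0.5,
              11: 0.74,
              10: 0.5, 12: 0.55}  # nodes 10 and 12: hypothetical values

END_NODES = {10, 11, 12}  # level-3 nodes reached in this example


def evaluate(sons, assessment, end_nodes, start=0):
    """Best-first evaluation of the structure graph.

    Returns (end node, its assessment, list of nodes whose sons were
    processed, in order).
    """
    open_nodes = []  # heap of (-assessment, node): best node first
    expanded = []

    def process_sons(node):
        # "Processing" a son stands for matching its model and
        # obtaining the assessment of its best parameter variation.
        for son in sons.get(node, []):
            heapq.heappush(open_nodes, (-assessment[son], son))

    process_sons(start)
    while open_nodes:
        neg, node = heapq.heappop(open_nodes)
        if node in end_nodes:
            return node, -neg, expanded
        expanded.append(node)
        process_sons(node)
    return None, None, expanded


result = evaluate(SONS, ASSESSMENT, END_NODES)
```

Here `result` is `(11, 0.74, [1, 2, 5])`: the sons of node 1 are processed first, then those of node 2, then those of node 5, and the evaluation ends at end node 11.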

Scene Analysis

Since scenes may have any degree of complexity, a structure graph which may be used for scene analysis is generally a very complex formation, in which there are typically many paths to the end nodes. This means that there may be nodes in the graph which possess more than one father node. This enables single objects to be reused as parts of other complex objects. Consequently, not only does this yield an economical representation of the knowledge about objects in structure graphs, but the evaluation process also profits from it, because the same object parts no longer compete with one another in different contexts.

This type of scene analysis is now described in the following with reference to FIGS. 3A, 3B and 4A, 4B.

In order to generate a description of a complex scene, a structure graph is used whose paths end at end nodes whose models represent different types of objects.

The lowermost level of such a structure graph contains models which differentiate the objects according to size, orientation and coarse structure. In the higher levels the objects are subdivided into various object classes so that at the end nodes differentiation may take place according to all or a large part of their established characteristics.

FIGS. 3A/3B illustrate picture matching models 100, 200, 300; 110; 210, 310, 320, 330, 340; 111, 211, 321, 341 used for a picture analysis on three different hierarchical levels 1, ..., i, ..., i+j, ... (1 < i < i+j).

FIGS. 4A/4B show a picture analysis which was carried out on a complex scene with the picture matching models illustrated in FIGS. 3A/3B.

As the upper partial picture of FIG. 4A shows, the evaluation process first processes the models 100, 200, 300 of the lowermost level 1 with the methods assigned to them. Here, the objects are classified according to their orientation or preferred direction and their rough position in the picture is determined. For the man shown in the upper partial picture of FIG. 4A, model 100 fits, for the tree model 200 fits, and model 300 is the best both for the car and for the house. With the decision for a node, the method initially favours objects of the appropriate size and coarse structure of the models of this node.

Further evaluation occurs again through the processing of the models of all son nodes with the method assigned to them. The selection of the next node occurs again from the nodes at which the previously assessed paths terminate. The evaluation however does not terminate with the determination of the first complete path, i.e. a path leading to an end node, but rather is continued until the assessment of the remaining paths appears to be uninteresting with regard to further evaluation. The result of this evaluation therefore consists of a set of complete paths through the structure graph, whereby each complete path belongs to an object in the picture. The set of these paths therefore corresponds to a description of the picture as a set of objects and their arrangement.
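The collection of several complete paths may be sketched as follows. The sketch assumes that a path becomes “uninteresting” when its assessment falls below a lower threshold `min_score`; the toy graph, the node names and all assessments are hypothetical and do not reproduce FIGS. 3A/3B or 4A/4B.

```python
import heapq


def complete_paths(roots, sons, assessment, min_score):
    """Collect all complete paths whose assessment stays interesting.

    roots      -- nodes of the lowermost level
    sons       -- dict: node -> list of son nodes (end nodes have none)
    assessment -- dict: node -> assessment of its best parameter variation
    min_score  -- hypothetical lower threshold below which a path is
                  regarded as uninteresting
    """
    heap = [(-assessment[r], [r]) for r in roots]
    heapq.heapify(heap)
    found = []
    while heap:
        neg, path = heapq.heappop(heap)
        if -neg < min_score:
            break  # every remaining path is assessed even lower
        node_sons = sons.get(path[-1], [])
        if not node_sons:
            found.append(path)  # complete path: one object in the picture
            continue
        for son in node_sons:
            heapq.heappush(heap, (-assessment[son], path + [son]))
    return found


# Hypothetical toy graph: two coarse models, four object classes.
sons = {"tall": ["person", "tree"], "wide": ["car", "house"]}
assessment = {"tall": 0.9, "wide": 0.85,
              "person": 0.8, "tree": 0.3, "car": 0.7, "house": 0.75}
paths = complete_paths(["tall", "wide"], sons, assessment, min_score=0.5)
```

In this toy case the evaluation returns the paths to "person", "house" and "car" and stops before the poorly assessed "tree" path, so the picture is described as containing three objects.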

1. A method of processing digitised picture data by matching of picture matching models, comprising: (i) provision of an hierarchical structure graph comprising: nodes, wherein each node represents at least one parameterised picture matching model, a specified number of levels, arranged one above the other, whereby at least one node is located in each level, edges connecting pairs of predetermined nodes of different levels and defining, for each pair of nodes, a father node as the node in the lower level and a son node as the node in the upper level; (ii) application of the structure graph to the picture data in that, starting with the lowermost level, at least one node is processed, whereby the processing of a node comprises the steps: matching of its at least one picture matching model to the picture data by variation of the model parameters, determination of a matching quantity for each parameter variation as a measure of the quality of the picture match, and determination of an assessment for each parameter variation, taking into account the at least one determined matching quantity, and whereby the assessment found for each parameter variation is applied as a criterion for the processing of a son node of the processed node and, if the criterion is fulfilled, the processing of the son node starts with the initialisation of its at least one matching model through predetermined parameters of the father node.
2. A method according to claim 1, in particular for the recognition of at least one object of a given object class in a picture, whereby the structure graph includes precisely one node in each level, which at least represents one picture matching model and a lower threshold value and/or an upper threshold value is assigned to given nodes, whereby the method is terminated with the result, that no object of the given object class is recognised, if for a node the assessment of each parameter variation lies below the lower threshold value assigned to the node, or that at least one object of the given object class is recognised, if for a node the assessment of at least one parameter variation lies above the upper threshold value assigned to the node or if the end node is reached.
3. A method according to claim 1 for the recognition of the orientation and/or arrangement of elements of an object of an object class in a picture, in particular for estimating the pose, whereby the structure graph for each orientation to be recognised and/or arrangement of elements of the object exhibits at least one node with at least one picture matching model for the orientation to be recognised and/or arrangement of the elements of the object, and an upper and/or lower threshold value for the assessment of the parameter variations is assigned to given nodes, whereby the processing of son nodes of those processed nodes is waived for which the assessment of each parameter variation lies below the lower threshold value assigned to the relevant node with the result that the corresponding orientations and/or arrangements of the elements of the object are not present in the picture, whereby son nodes of those processed nodes are processed for which the assessment of at least one parameter variation lies between the upper and lower threshold values assigned to the node, and whereby processing of son nodes for the parameter variations is waived whose assessment lies above the upper threshold value assigned to the relevant node with the result that the orientation and/or arrangement of the elements of the object is classified as being present in the picture which receives the best assessment on the highest fully processed level.
4. A method according to claim 1 for scene analysis through the recognition of objects of different object classes and the arrangement of the objects in a picture: whereby the structure graph for each object class exhibits at least one node with at least one picture matching model for the object class and a lower and/or upper threshold value is assigned to given nodes for the assessment of the parameter variations, whereby the processing of son nodes of those processed nodes is waived for which the assessment of each parameter variation lies below the lower threshold value assigned to the relevant node with the result that the associated object is classified as not being present in the picture, whereby son nodes of those processed nodes are processed for which the assessment of at least one parameter variation lies between the lower and upper threshold values assigned to the relevant nodes, and whereby the processing of son nodes for the parameter variations is waived whose assessment lies above the upper threshold value assigned to the relevant node with the result that the associated object is classified as being present in the picture.
5. A method according to claim 1, in which a lower threshold value is assigned to each node of the structure graph and the processing of son nodes of a node is waived if each parameter variation produces an assessment below the lower threshold value assigned to the node.
6. A method according to claim 1, in which the lower and/or upper threshold value for given nodes is dynamically adaptable.
7. A method according to claim 1, whereby, at least partially, parameters of the processed father nodes are accepted for the processing of given nodes.
8. A method according to claim 1, whereby the assessment of the father node is taken into account for the assessment of given nodes.
9. A method according to claim 1, in which weightings, which are included in the assessment, are assigned to each picture matching model and/or each edge.
10. A method according to claim 1, in which the picture matching models of the nodes of the uppermost level are based on digitised reference picture data and/or the picture matching models of predetermined nodes are based on the picture matching models of their son nodes.
11. A method according to claim 1, in which the picture matching models comprise graphs of features which are the result of the application of predetermined filters on reference picture data.
12. A method according to claim 11, in which the features include jets, whereby the scaling of the filters from which the features are obtained becomes smaller from the lower levels to the upper levels in the structure graph.
13. A method according to claim 12, in which the matching quantity for the parameter variations is computed according to the formula
$$S^{(n)} = f^{(n)}\bigl(P_k^{(n)}\, j(k,m),\; P_k^{(n)}\, \tilde{j}(k),\; d^{(0)}(k,m),\; d^{(n)}(k,m)\bigr), \quad k \in K'(n) \subset K,$$
whereby: $j(k,m)$ is the jet of the picture matching model $m$ on the node $k$, $\tilde{j}(k)$ is the jet of the picture on the position of the node $k$, $d^{(0)}(k,m)$ is the original position of the node $k$ of the picture matching model $m$, $d^{(n)}(k,m)$ is the position of the node $k$ of the picture matching model $m$ in step $n$, $f^{(n)}(\ldots)$ is a functional of the picture matching model jet and the picture jet at corresponding locations, $P_k^{(n)}$ represents an image of the jets $j(k,m)$ or $\tilde{j}(k)$, and $K'(n)$ is a subset of the set $K$ of all graph nodes $k$.
14. A method according to claim 12, in which the features comprise combinations of various types of jet.
15. A method according to claim 2 for picture compression and picture sequence compression, comprising the steps: compressing each recognised object with a compression factor specified for the corresponding object class, whereby the control of the parameters of the compression method is based on the parameters of the results of the scene analysis, and compressing the picture region not occupied by objects using a higher specified compression factor.
16. A method of decompressing a picture which has been compressed according to the method described in claim 15.
17. A method substituting at least one selected object, recognised according to claim 2, by an object representative, in particular by an avatar, comprising the steps: providing reference pictures of the object representatives, substituting the at least one selected recognised object by the object representative, whereby part of the parameters of the picture matching models may be used for the control of the object representative.
18. A method of substituting a picture background by an alternate background, comprising the steps: providing at least one reference picture for the alternate background, positioning of the objects recognised according to claim 2 in the reference picture.
19. A method, comprising: (i) substituting at least one selected object, recognised according to claim 2, by an object representative, in particular by an avatar, comprising the steps: providing reference pictures of the object representatives, substituting the at least one selected recognised object by the object representative, whereby part of the parameters of the picture matching models may be used for the control of the object representative; and (ii) substituting a picture background by an alternate background, comprising the steps: providing at least one reference picture for the alternate background, positioning of the objects recognised according to claim 2 in the reference picture; whereby at least one selected recognised object is substituted by an object representative and the background is substituted by the alternate background.
20. A method of visually displaying scenes processed according to claim 17, comprising the steps: providing a data base with the object representatives and/or the reference pictures for the background.
21. A method according to claim 16, in which the object representatives comprise real objects and/or virtual objects.
22. A method according to claim 1, whereby, for the processing of individual pictures in a picture sequence, the parameters of the picture matching models are assigned initial values which use part of the parameters from the processing of previous pictures.
23. A method according to claim 22, whereby the possible variations of the parameters of the picture matching models, based on a part of the parameters from the processing of previous pictures, are restricted.
24. A method according to claim 13, in which the features comprise combinations of various types of jet.